Abstract. Grammar inference deals with determining (preferably simple)
models/grammars consistent with a set of observations. There is a
large body of research on grammar inference within the theory of formal
languages. However, surprisingly little is known about grammar
inference for graph grammars. In this paper we take a further step in
this direction and work within the framework of node label controlled
(NLC) graph grammars. Specifically, we characterize, given a set of
disjoint and isomorphic subgraphs of a graph G, whether or not there is an
NLC graph grammar rule which can generate these subgraphs to obtain
G. This generalizes previous results by assuming that the set of
isomorphic subgraphs is disjoint instead of non-touching. This leads naturally
to considering the more involved "non-confluent" graph grammar rules.



1 Introduction 

Grammar inference, also called grammar induction, is a general line of research 
where one is concerned with determining a "simple" grammar that is consistent 
with a given set of possible and impossible outcomes. Hence, one "goes back" in 
the derivation: instead of determining the generative power of a grammar, one 
determines the grammar given the generated output. This topic is well-studied
for formal languages, especially with respect to context-free languages (see,
e.g., [6,4]); however, relatively little is known for graph grammars.

The topic of inference of graph grammars is considered in [5], which uses the
so-called Subdue scheme developed in [2]. In [1] a rigorous approach to grammar
inference within the framework of node label controlled (NLC) graph grammars 
[3], a natural and well-studied class of graph grammars, is initiated. There it is 
characterized, given a set S of non-touching isomorphic graphs of a graph G, 
whether or not there is a graph grammar consisting of one rule able to generate 
the graphs of S to obtain G. We continue this research and generalize this result 
for the case where these graphs are disjoint instead of non-touching. Such a 
generalization requires one to deal with a number of issues. Most notably, one 
has to deal with non-confluency issues: the generated graph depends on the order 
in which touching subgraphs are generated. 



2 Notation and Terminology 

We consider (simple) graphs $G = (V, E)$, where $V$ is a finite set of nodes and
$E \subseteq \{\{x, y\} \mid x, y \in V, x \neq y\}$ is the set of edges; hence no
loops or parallel edges are allowed. We denote $V(G) = V$ and $E(G) = E$. For
$S \subseteq V$, the induced subgraph of $G$ is $(S, E')$ where $E' \subseteq E$ and
for each $e \in E$ we have $e \in E'$ iff $e \subseteq S$. We consider only induced
subgraphs, and therefore we sometimes just write "subgraph" instead of induced
subgraph. The neighborhood of $S \subseteq V$ in $G$, denoted by $N_G(S)$, is
$\{v \in V \setminus S \mid \{s, v\} \in E \text{ for some } s \in S\}$. If
$S = \{x\}$ is a singleton, then we also write $N_G(x) = N_G(S)$. A labelled graph
is a triple $G = (V, E, l)$ where $(V, E)$ is a graph and $l : V \to L$ is a node
labelling function, where $L$ is a finite set of labels. As usual, graphs are
considered isomorphic if they are identical modulo the identity of the vertices.
It is important to realize that for labelled graphs, vertices identified by an
isomorphism have identical labels. In graphical depictions of labelled graphs we
will always represent the vertices by their labels.

Subgraphs $G_1$ and $G_2$ are called disjoint if $V(G_1)$ and $V(G_2)$ are disjoint.
They are called touching if $V(G_1) \cup N_G(V(G_1))$ and $V(G_2) \cup N_G(V(G_2))$
are not disjoint.
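
These definitions translate directly into set operations. The following is a minimal Python sketch and not part of the original development: it assumes a graph is given as a set of two-element frozensets, and the function names `neighborhood`, `disjoint` and `touching` are purely illustrative. Note that, under the definition as stated, two subgraphs that merely share a common neighbor already count as touching.

```python
def neighborhood(edges, nodes):
    """N_G(S): nodes outside `nodes` adjacent to some node of `nodes`."""
    nodes = set(nodes)
    result = set()
    for edge in edges:
        if len(edge & nodes) == 1:      # exactly one endpoint lies in S
            result |= edge - nodes
    return result


def disjoint(nodes1, nodes2):
    """Two subgraphs are disjoint iff they share no node."""
    return not (set(nodes1) & set(nodes2))


def touching(edges, nodes1, nodes2):
    """Touching iff the closed neighborhoods V(G_i) + N_G(V(G_i)) intersect."""
    closed1 = set(nodes1) | neighborhood(edges, nodes1)
    closed2 = set(nodes2) | neighborhood(edges, nodes2)
    return bool(closed1 & closed2)


# Hypothetical path graph 1 - 2 - 3 - 4 - 5 - 6 (not from the paper).
path_edges = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]}
```

In this hypothetical path, the subgraphs on `{1, 2}` and `{3, 4}` are disjoint and touching, while those on `{1}` and `{4}` are disjoint and non-touching.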

Define, for disjoint $W_1, W_2 \subseteq V$,
$K_{W_1,W_2} = \{(x_1, x_2) \mid x_1 \in W_1, x_2 \in W_2\}$ to be the set of all
tuples with the left element from $W_1$ and the right element from $W_2$. Define
$u((x_1, x_2))$ to be the underlying set $\{x_1, x_2\}$, and define
$\pi_i((x_1, x_2)) = x_i$ for $i \in \{1, 2\}$. Often, for a function
$f : X \to Y$ we write $f(D) = \{f(x) \mid x \in D\}$ for $D \subseteq X$.

3 NLC Graph Grammars 

Typically, a graph grammar transforms a graph G by replacing an (induced) 
subgraph H by another graph H' where H' is embedded in the remaining part 
G\H of the original graph in a way prescribed by a so-called graph grammar 
embedding relation. The node label controlled (NLC) graph grammars are the 
simplest class of these grammars, where H is a single node. Note that for the
grammars the exact identities of the nodes are not important as multiple copies 
of H' may be inserted. Hence, we consider labelled graphs where the embedding 
relation is defined w.r.t. node labels instead of nodes. In this section we recall 
informally the notions and definitions concerning NLC grammars used in this 
paper, and refer to [3] for a gentle and more detailed introduction to these 
grammars. 

An NLC graph grammar is a system $\mathcal{G}$ consisting of a set of node labels
$L$, an embedding relation $E \subseteq L^2$ and a set of productions $P$ where a
production is of the form $N \to S$ where $N \in L$ and $S$ is a (labelled) graph.
In this paper we will focus on the case $|P| = 1$. Hence $\mathcal{G}$ can be
denoted as a rule $r = N \to S/E$ (if $L$ is understood from the context of
considerations). Given a graph $G$, $r$ can be applied to any node $v$ labelled by
$N$. The result of applying $r$ to $v$ in $G$ is that $v$ is removed from $G$ along
with the edges incident to $v$, and (a copy of) $S$ is added to $G$, and an edge
$e = \{x, y\}$ is added to $G$ iff $x \in V(S)$, $y \in N_G(v)$ and
$(l(x), l(y)) \in E$ (recall that $l$ is the labelling function). To avoid
confusion with embedding relations, the set of edges of a graph $G$ is written in
the remainder as $E(G)$ and not as $E$.
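
The effect of a single rule application, and the order dependence it can cause, may be illustrated by a small Python sketch. Everything below is an assumption made for illustration only: the representation (a dict from node to label plus a set of frozenset edges), the function name `apply_rule`, and the tiny example rule, which is not the rule of the running example.

```python
import itertools
from collections import Counter

_fresh = itertools.count()  # supplies a new tag for every inserted copy of S


def apply_rule(labels, edges, v, s_labels, s_edges, embedding, lhs="N"):
    """Apply N -> S/E to node v and return a new (labels, edges) pair."""
    if labels[v] != lhs:
        raise ValueError("the rule is only applicable to nodes labelled N")
    neighbours = {u for e in edges if v in e for u in e if u != v}
    # Remove v together with its incident edges.
    new_labels = {u: lab for u, lab in labels.items() if u != v}
    new_edges = {e for e in edges if v not in e}
    # Add a fresh copy of S.
    tag = next(_fresh)
    copy = {x: (x, tag) for x in s_labels}
    for x, lab in s_labels.items():
        new_labels[copy[x]] = lab
    for e in s_edges:
        x, y = tuple(e)
        new_edges.add(frozenset({copy[x], copy[y]}))
    # Connect x in S to a former neighbour y of v iff (l(x), l(y)) is in E.
    for x, lab in s_labels.items():
        for y in neighbours:
            if (lab, labels[y]) in embedding:
                new_edges.add(frozenset({copy[x], y}))
    return new_labels, new_edges


def degrees(edge_set):
    """Sorted degree sequence of the non-isolated nodes."""
    return sorted(Counter(u for e in edge_set for u in e).values())


# Hypothetical graph c - N - N and hypothetical rule (S: two isolated nodes a, b).
labels = {0: "c", 1: "N", 2: "N"}
edges = {frozenset({0, 1}), frozenset({1, 2})}
s_labels, s_edges = {"x": "a", "y": "b"}, set()
E = {("a", "N"), ("b", "a"), ("a", "c")}

l1, e1 = apply_rule(labels, edges, 1, s_labels, s_edges, E)
_, first = apply_rule(l1, e1, 2, s_labels, s_edges, E)    # node 1, then node 2
l2, e2 = apply_rule(labels, edges, 2, s_labels, s_edges, E)
_, second = apply_rule(l2, e2, 1, s_labels, s_edges, E)   # node 2, then node 1
```

In this hypothetical setting the two derivation orders yield graphs with different degree sequences, so the results are not isomorphic; this is the kind of non-confluency that arises once $E$ may contain tuples with $N$.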








Fig. 1: The derivation of a graph G (left-hand side) to G' (right-hand side). 



Example 1. Let $G$ be the graph on the left-hand side of Figure 1. Consider the
grammar rule $r = N \to S/E'$, where $S$ is the graph



b c 



and $E' = \{(a, b), (b, a), (c, c), (a, N), (c, N)\}$. (Note that formally we have
only defined $S$ up to isomorphism; however, as we have seen, this is not an
objection.) Then Figure 1 depicts one possible derivation from $G$ to a graph $G'$
(on the right-hand side of the figure) for which no rule is applicable anymore.
Note that there is one other possible derivation to a "terminal" graph $G''$
(i.e., a graph without vertices labelled by $N$): to obtain $G''$ we choose first
the right-hand vertex labelled by $N$ (the one not connected to the vertex
labelled by $b$) in $G$ in the derivation. Note that $G'$ and $G''$ are different
graphs. We assume that the set of labels $L$ is $\{a, b, c, N\}$. This example will
be the running example of this paper.

In [1] the inference of NLC grammars with exactly one rule $r = N \to S/E$
is studied, where moreover $S$ does not contain a vertex labelled by $N$ and $E$
does not contain a tuple containing $N$. This is sufficient for the case where the
subgraphs isomorphic to $S$ are non-touching. To consider the case where the
subgraphs are disjoint, we allow $E$ to contain tuples containing $N$. However, we
do require that $S$ remains without vertices labelled by $N$. Therefore there is no
"real" recursion: no vertices labelled by $N$ can be introduced in any derivation.

4 Known results: Non-touching graphs

In this section we recall some notions and a result from [1] which we will need in 
subsequent sections. First we define in this context the notion of compatibility. 

Definition 2. Let $G$ be a graph and $S$ be an induced subgraph of $G$. We say
that $E \subseteq L \times L$ is compatible for $S$ (in $G$) if there is a graph $F$
such that an application of NLC grammar rule $N \to S'/E$ to $F$ "creates" $S$ and
obtains graph $G$. Note: $S'$ is (isomorphic to) $S$.

Example 3. Reconsider our running example. Hence we again let $G'$ be the graph
at the right-hand side of Figure 1. Moreover we let $S_1$ and $S_2$ be the subgraphs
of $G'$ of the form

a a 

b c 

where $S_1$ is the one connected to the vertex labelled by $b$ and $S_2$ is the
other one. Note that $S_1$ and $S_2$ are disjoint and touching in $G'$. We have
that, e.g., $E_1 = \{(b, a), (c, c)\}$, $E_2 = \{(b, a), (c, c), (a, b)\}$ or
$E_3 = \{(b, a), (c, c), (c, b), (b, b)\}$ is compatible for $S_2$ in $G'$. The
middle graph of the figure is a graph $F$ such that an application of the NLC
grammar rule $N \to S/E$ to $F$ "creates" $S_2$ and obtains graph $G'$.

To characterize the notion of compatibility, the notions of inset and outset
for arbitrary $Q \subseteq V^2$ (where $V$ is the set of nodes of $G$) are crucial.

Definition 4. Let $Q \subseteq V^2$, and let $P_Q = \{In_Q, Out_Q\}$ be the
partition of $Q$ where, for $x \in Q$, $x \in In_Q$ iff $u(x) \in E(G)$. We define
the elements of $\{l(In_Q), l(Out_Q)\}$ as the inset, denoted by $I_Q$, and outset,
denoted by $O_Q$, of $Q$, respectively.

Let $S$ be an induced subgraph of $G$. Then the inset (outset, resp.) of $S$,
denoted by $I_S$ ($O_S$, resp.), is defined to be the inset (outset, resp.) of
$Q = K_{V(S), N_G(V(S))}$.

The following lemma, given and proven in [1], characterizes compatibility for
a single graph $S$ in terms of the inset and outset of $S$: the inset consists of
tuples that should be in $E$, while the outset consists of tuples that should not
be in $E$.

Lemma 5. Let $S$ be an induced subgraph of $G$, and let $E \subseteq L \times L$.
Then $E$ is compatible for $S$ iff $I_S \subseteq E \subseteq L^2 \setminus O_S$
(i.e., $E$ separates $I_S$ from $O_S$).

Hence, there is a compatible $E$ for $S$ in $G$ iff $I_S \cap O_S = \emptyset$.
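
The inset, the outset and the test of Lemma 5 can be sketched as follows. This is a minimal Python illustration under assumed conventions (a dict from node to label, edges as frozensets); the names `inset_outset` and `compatible_single` and the small graphs are hypothetical and do not come from [1] or from the running example.

```python
def inset_outset(labels, edges, sub_nodes):
    """Return (I_S, O_S) as sets of label pairs for Q = K_{V(S), N_G(V(S))}."""
    sub_nodes = set(sub_nodes)
    nbrs = {u for e in edges if len(e & sub_nodes) == 1 for u in e - sub_nodes}
    inset, outset = set(), set()
    for x in sub_nodes:
        for y in nbrs:
            pair = (labels[x], labels[y])
            # The pair goes to the inset iff {x, y} is an edge of G.
            (inset if frozenset({x, y}) in edges else outset).add(pair)
    return inset, outset


def compatible_single(embedding, inset, outset):
    """Lemma 5: I_S is contained in E, and E contains no tuple of O_S."""
    return inset <= embedding and not (embedding & outset)


# Hypothetical labelled graphs; S is induced by the nodes 1 (a) and 2 (b).
labels = {1: "a", 2: "b", 3: "c", 4: "c"}
edges_ok = {frozenset(p) for p in [(1, 2), (1, 3), (1, 4)]}
edges_bad = {frozenset(p) for p in [(1, 2), (1, 3), (2, 4)]}

I_ok, O_ok = inset_outset(labels, edges_ok, {1, 2})
I_bad, O_bad = inset_outset(labels, edges_bad, {1, 2})
```

In the second hypothetical graph the node labelled $a$ is adjacent to one $c$-neighbor but not to the other, so $(a, c)$ lies in both the inset and the outset and no compatible $E$ exists.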

Example 6. Reconsider again our running example. Then $I_{S_2} = \{(b, a), (c, c)\}$
and $O_{S_2} = \{(a, a), (a, c), (c, a), (b, c)\}$ (w.r.t. $G'$). Since
$I_{S_2} \cap O_{S_2} = \emptyset$, there is a compatible $E$ for $S_2$ in $G'$. We
have that $I_{S_2} \subseteq E \subseteq L^2 \setminus O_{S_2}$ holds for, e.g.,
$E_1$, $E_2$ and $E_3$ in Example 3.

We consider now sequences of subgraphs to be generated by a single graph 
rule. Note that these graphs must necessarily be mutually isomorphic. 

Definition 7. Let $G$ be a graph and $S_1, S_2, \ldots, S_n$ be induced subgraphs
of $G$ isomorphic to $S$. We say that $E \subseteq L \times L$ is compatible for
$(S_1, S_2, \ldots, S_n)$ (in $G$) if there are graphs $G_0, \ldots, G_n$ such that
$G_n = G$ and for each $i \in \{1, \ldots, n\}$, $G_i$ is obtained from $G_{i-1}$ by
applying NLC grammar rule $N \to S/E$ that "creates" $S_i$.

Note that, in general, the order of the elements $(S_1, S_2, \ldots, S_n)$ is
important. E.g., a given $E$ may be compatible for $(S_1, S_2)$ while it is
incompatible for $(S_2, S_1)$ (we will see such an example in the next section).

However, for a set of mutually non-touching and isomorphic subgraphs $S_i$
for $i \in \{1, \ldots, n\}$ of $G$, the order of the elements is not important.
Thus, $E \subseteq L \times L$ compatible for $C = (S_1, S_2, \ldots, S_n)$ implies
that $E$ is compatible for any permutation of $C$. In fact, $E \subseteq L \times L$
is compatible for $S_1$, for $S_2$, ..., and for $S_n$ iff it is compatible for $C$
(or any permutation of $C$). Therefore, in this case, Lemma 5 is trivially
generalized: $E \subseteq L \times L$ is compatible for $(S_1, S_2, \ldots, S_n)$ iff
$\bigcup_i I_{S_i} \subseteq E \subseteq L^2 \setminus (\bigcup_i O_{S_i})$ (as noted in [1]).
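
For mutually non-touching subgraphs, the generalized condition amounts to taking unions of the individual insets and outsets. A minimal Python sketch follows; the function name `compatible_for_all` and the sample sets are hypothetical and serve only as an illustration.

```python
def compatible_for_all(embedding, insets, outsets):
    """Check that the union of all I_{S_i} lies in E and E avoids every O_{S_i}."""
    all_in = set().union(*insets)
    all_out = set().union(*outsets)
    return all_in <= embedding and not (embedding & all_out)


# Hypothetical insets/outsets of three mutually non-touching subgraphs.
insets = [{("a", "c")}, {("a", "c"), ("b", "d")}, set()]
outsets = [{("b", "c")}, {("a", "d")}, {("b", "c")}]
E = {("a", "c"), ("b", "d")}
```

Because only unions are involved, the result does not depend on the order in which the subgraphs are listed, in line with the observation above.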

5 Two touching graphs 

In this section we consider the case where a single NLC grammar rule $N \to S/E$
generates disjoint subgraphs which can (possibly) touch each other. Hence, this
generalizes Lemma 5 by relaxing the non-touching condition to disjointness.
To this aim we allow non-terminal $N$ to be present in tuples of the embedding
relation $E$ of NLC grammar rule $N \to S/E$. This introduces the issue of
confluency: the order in which non-terminals are replaced by subgraphs influences
the obtained graph. Example 1 illustrates this as the different graphs $G'$ and
$G''$ can both be obtained from the original graph $G$.

As we will see, the inset and outset between the vertices of two touching
graphs turn out to be crucial.

Definition 8. Let $S_1$ and $S_2$ be touching graphs in $G$. For
$Q_1 = K_{V(S_2),\, V(S_1) \cap N_G(S_2)}$, we denote $I_{Q_1}$ and $O_{Q_1}$ by
$I_{(S_1,S_2)}$ and $O_{(S_1,S_2)}$, respectively. Moreover, for
$Q_2 = K_{V(S_2),\, V(S_1)}$, we denote $I_{Q_2}$ and $O_{Q_2}$ by $I_{((S_1,S_2))}$
and $O_{((S_1,S_2))}$, respectively.

We now state some basic properties of the insets and outsets of Definition 8.
Note first that $I_{(S_1,S_2)} = I_{((S_1,S_2))}$. In fact, it is equal to the inset
of

$K_{V(S_2) \cap N_G(S_1),\, V(S_1) \cap N_G(S_2)}$.

Also note that, for node labels $x$ and $y$, we have $(x, y) \in I_{((S_1,S_2))}$ iff
$(y, x) \in I_{((S_2,S_1))}$. This holds similarly for $O_{((S_1,S_2))}$; however,
this does not hold in general for $O_{(S_1,S_2)}$. Moreover note that
$O_{(S_1,S_2)} \subseteq O_{((S_1,S_2))}$, and

$O_{((S_1,S_2))} \setminus O_{(S_1,S_2)} = l(K_{V(S_2),\, V(S_1) \setminus N_G(S_2)})$.

Finally note that $\pi_2(I_{(S_1,S_2)}) = l(V(S_1) \cap N_G(S_2))$. We will use these
basic properties frequently in the remainder of this paper.

Example 9. In our running example, we have $I_{(S_1,S_2)} = \{(b, a), (c, c)\}$,
$O_{(S_1,S_2)} = \{(a, a), (a, c), (b, c), (c, a)\}$, and
$O_{((S_1,S_2))} = L'^2 \setminus I_{(S_1,S_2)}$ with $L' = \{a, b, c\}$. Moreover,
we have $I_{(S_2,S_1)} = \{(a, b), (c, c)\}$,
$O_{(S_2,S_1)} = \{(a, c), (b, b), (b, c), (c, b)\}$, and, by the symmetry noted
above, $O_{((S_2,S_1))} = L'^2 \setminus I_{(S_2,S_1)}$.

We now adapt the definition of inset and outset for a graph S, by incorpo- 
rating the issues related to touching graphs. 

Definition 10. Let $S_1, \ldots, S_n$ be distinct subgraphs of $G$, and let
$Q = \bigcup_{i \in \{1,\ldots,n\}} K_{V(S_i),\, N_G(V(S_i)) \setminus (\bigcup_{j \in \{1,\ldots,n\}} V(S_j))}$.
We denote $I_Q$ and $O_Q$ by $I_{[S_1,\ldots,S_n]}$ and $O_{[S_1,\ldots,S_n]}$,
respectively.

Note that $I_{[S_1,S_2]} = I_{[S_2,S_1]}$ and, if $S_1$ and $S_2$ are non-touching,
we have $I_{[S_1,S_2]} = I_{S_1} \cup I_{S_2}$.

Example 11. In our running example, we have $I_{[S_1,S_2]} = \{(a, b)\}$, and
$O_{[S_1,S_2]} = \{(b, b), (c, b)\}$.

Definitions 8 and 10 separate three types of insets and outsets. Roughly
speaking, the two types of insets and outsets of Definition 8 deal with the tuples
between $S_1$ and $S_2$, while the type of inset and outset of Definition 10 deals
with the tuples from $S_1$ to the "outside world" (the vertices in the neighborhood
of $S_1$ which do not belong to $S_2$) plus the tuples from $S_2$ to the "outside
world" (the vertices in the neighborhood of $S_2$ which do not belong to $S_1$).

We now characterize the embedding relations $E$ such that $E$ is compatible
for $(S_1, S_2)$ where $S_1$ and $S_2$ are touching subgraphs of $G$.

Lemma 12. Let $S_1$ and $S_2$ be touching subgraphs of $G$. Then
$E \subseteq L \times L$ is compatible for $(S_1, S_2)$ iff the following
conditions hold:

1. $I_{(S_1,S_2)} \subseteq E$,

2. $\{(x, N) \mid x \in \pi_2(I_{(S_1,S_2)})\} \subseteq E$,

3. If $e \in O_{((S_1,S_2))}$, then either $(\pi_2(e), N) \notin E$ or $e \notin E$ (or both), and

4. $I_{[S_1,S_2]} \subseteq E \subseteq L^2 \setminus O_{[S_1,S_2]}$.

Moreover, if this is the case, then we have $E \cap O_{(S_1,S_2)} = \emptyset$.

Proof. In the case where there are no edges between $S_1$ and $S_2$, we have, by
Lemma 5, that $E \subseteq L \times L$ is compatible for $(S_1, S_2)$ iff
$I_{S_1} \cup I_{S_2} \subseteq E \subseteq L^2 \setminus (O_{S_1} \cup O_{S_2})$;
this is equivalent to condition (4) (since $S_1$ and $S_2$ are non-touching).

Now, since edges between $S_1$ and $S_2$ can only introduce additional constraints
on $E$ (i.e., not fewer constraints), we may consider only the graph
$F = N - N$, an edge having two vertices labelled by $N$, and check the necessary
and sufficient (additional) constraints on $E$ to transform the graph in two steps
where $S_1$ appears first and then $S_2$, such that the edges between $S_1$ and
$S_2$ are identical to those between $S_1$ and $S_2$ in $G$.

Now, let $x$ be a vertex of $S_1$ labelled by $b$, and $y$ be a vertex of $S_2$
labelled by $a$. Assume first that $x$ is connected to $y$ in $G$. Now, if we apply
the NLC rule to create $S_1$, then $x$ should be connected to $N$; thus we need
$(b, N) \in E$. Indeed, without this tuple $x$ will not be connected to any vertex
of $S_2$ (after applying the NLC rule to create $S_2$). Now, if we subsequently
apply the NLC rule to create $S_2$, then $y$ should be connected to $x$ and hence
we need $(a, b) \in E$. Hence $(a, b) \in E$ and $(b, N) \in E$ results in an edge
between $x$ and $y$. Conversely, if either $(b, N) \notin E$ or $(a, b) \notin E$,
then $x$ is not connected to $y$. Consequently, both $(a, b) \in E$ and
$(b, N) \in E$ iff there is an edge between a/every vertex labelled by $b$ in $S_1$
and a/every vertex labelled by $a$ in $S_2$.

Thus, $E$ is compatible for $(S_1, S_2)$ iff
$I_{((S_1,S_2))} \cup \{(x, N) \mid x \in \pi_2(I_{((S_1,S_2))})\} \subseteq E$
and both $(a, b) \in E$ and $(b, N) \in E$ implies $(a, b) \notin O_{((S_1,S_2))}$.

Finally, we have in this case $E \cap O_{(S_1,S_2)} = \emptyset$. Indeed, if
$e \in E \cap O_{(S_1,S_2)}$, then $e \in O_{((S_1,S_2))}$ and $e \in E$ and
therefore, by condition (3), $(\pi_2(e), N) \notin E$. Now,
$\pi_2(O_{(S_1,S_2)}) \subseteq \pi_2(I_{(S_1,S_2)}) = l(V(S_1) \cap N_G(S_2))$,
cf. Definition 8. Consequently, by condition (2), $(\pi_2(e), N) \in E$, a
contradiction. □

Intuitively, condition (4) of Lemma 12 deals with the edges of $S_1$ and $S_2$ to
the "outside world", while conditions (1) to (3) deal with the edges between
$S_1$ and $S_2$. Conditions (1) and (2) state the tuples that must necessarily be
in $E$, while condition (3) states requirements on which tuples must not
(together) be in $E$.

Since $E \cap O_{(S_1,S_2)} = \emptyset$ by Lemma 12, we may modify conditions (1)
and (3) of the previous lemma as follows:

1'. $I_{(S_1,S_2)} \subseteq E \subseteq L^2 \setminus O_{(S_1,S_2)}$,
3'. If $e \in O_{((S_1,S_2))} \setminus O_{(S_1,S_2)} = l(K_{V(S_2),\, V(S_1) \setminus N_G(S_2)})$,
then either $(\pi_2(e), N) \notin E$ or $e \notin E$ (or both).

However, in this way the condition $E \cap O_{(S_1,S_2)} = \emptyset$ is explicitly
assumed and not part of the result as stated in the lemma.

Remark 13. By condition (4) of the lemma, we may go even further and instead
state "$e \in l(K_{V(S_2),\, V(S_1) \setminus N_G(S_2)}) \setminus O_{[S_1,S_2]}$" in
condition (3). Therefore in practice it may be easier to check condition (3) if
one considers only the (smaller) set

$O_{((S_1,S_2))} \setminus (O_{(S_1,S_2)} \cup O_{[S_1,S_2]}) = l(K_{V(S_2),\, V(S_1) \setminus N_G(S_2)}) \setminus O_{[S_1,S_2]}$. □

Also note that, for $e \in I_{[S_1,S_2]} \cup I_{(S_1,S_2)}$ (and hence $e \in E$),
$e \in O_{((S_1,S_2))}$ implies $(\pi_2(e), N) \notin E$.

Example 14. We continue our running example. As we have seen, an
$E \subseteq L \times L$ compatible for $(S_1, S_2)$ in $G'$ allows, given the graph
$G$ on the left-hand side of Figure 1, for the generation of the middle graph (in
the figure) and subsequently the generation of $G'$. We will now determine, using
Lemma 12 and the modified conditions below the lemma, the constraints on $E$ for it
to be compatible for $(S_1, S_2)$.

Recall that $I_{(S_1,S_2)} = \{(b, a), (c, c)\}$,
$O_{(S_1,S_2)} = \{(a, a), (a, c), (b, c), (c, a)\}$, $I_{[S_1,S_2]} = \{(a, b)\}$,
and $O_{[S_1,S_2]} = \{(b, b), (c, b)\}$. Moreover,
$\{(x, N) \mid x \in \pi_2(I_{(S_1,S_2)}) = l(V(S_1) \cap N_G(S_2))\} = \{(a, N), (c, N)\}$.
Hence, by conditions (1'), (2), and (4) of Lemma 12 we have

$\{(a, b), (b, a), (c, c), (a, N), (c, N)\} \subseteq E$

and

$E \cap \{(a, a), (a, c), (b, b), (b, c), (c, a), (c, b)\} = \emptyset$.

Now, $O_{((S_1,S_2))} \setminus O_{(S_1,S_2)} = l(K_{V(S_2),\, V(S_1) \setminus N_G(S_2)}) = K_{\{a,b,c\},\{b\}} = \{(a, b), (b, b), (c, b)\}$.
Hence by condition (3') either $(b, N) \notin E$ or $(a, b) \notin E$. The latter
is a contradiction, hence $(b, N) \notin E$. Consequently,

$E = \{(a, b), (b, a), (c, c), (a, N), (c, N)\}$

is compatible for $(S_1, S_2)$; in fact, in this case, it is the unique $E$ that is
compatible for $(S_1, S_2)$ in $G'$. Note that adding $(b, N)$ to $E$ would indeed
make it incompatible: the generated graph would then have edges from the vertex
labelled $b$ in $S_1$ to the two vertices labelled $a$ in $S_2$. Also note that
this $E$ is not compatible for $(S_2, S_1)$ in $G'$.
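
The four conditions of Lemma 12 can be checked mechanically once the relevant insets and outsets are known. The following Python sketch is only an illustration: the function name `lemma12_holds` and the encoding of tuples as pairs of strings are assumptions, while the sets themselves are those stated in Examples 9, 11 and 14.

```python
def pi2(pairs):
    """Project a set of tuples onto the right-hand components."""
    return {y for (_, y) in pairs}


def lemma12_holds(E, I_pair, O_full, I_out, O_out, N="N"):
    """Check conditions (1)-(4) of Lemma 12 for a candidate E.

    I_pair = I_(S1,S2), O_full = O_((S1,S2)),
    I_out = I_[S1,S2], O_out = O_[S1,S2].
    """
    c1 = I_pair <= E
    c2 = all((x, N) in E for x in pi2(I_pair))
    c3 = all((e[1], N) not in E or e not in E for e in O_full)
    c4 = I_out <= E and not (E & O_out)
    return c1 and c2 and c3 and c4


# Sets of the running example (Examples 9, 11 and 14).
Lp = "abc"
I_pair = {("b", "a"), ("c", "c")}
O_full = {(x, y) for x in Lp for y in Lp} - I_pair
I_out = {("a", "b")}
O_out = {("b", "b"), ("c", "b")}
E = {("a", "b"), ("b", "a"), ("c", "c"), ("a", "N"), ("c", "N")}
```

With these sets, `lemma12_holds(E, I_pair, O_full, I_out, O_out)` holds, whereas adding `("b", "N")` to `E` violates condition (3) for the tuple `("a", "b")`.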

Using Lemma 12, the existence of an embedding relation E is elegantly char- 
acterized, as shown in the next lemma. 

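Lemma 15. Let $S_1$ and $S_2$ be touching graphs. There is a compatible
$E \subseteq L \times L$ for $(S_1, S_2)$ iff
$(I_{[S_1,S_2]} \cup I_{(S_1,S_2)}) \cap O_{[S_1,S_2]} = \emptyset$,
$\pi_2(I_{(S_1,S_2)}) \cap \pi_2(I_{[S_1,S_2]} \cap O_{((S_1,S_2))}) = \emptyset$, and
$I_{(S_1,S_2)} \cap O_{((S_1,S_2))} = \emptyset$. Moreover, if this is the case, then
$(I_{[S_1,S_2]} \cup I_{(S_1,S_2)}) \cap O_{(S_1,S_2)} = \emptyset$.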

Proof. Assume first that the right-hand side holds. Then take
$E' = I_{[S_1,S_2]} \cup I_{(S_1,S_2)}$, take $F' = \pi_2(I_{(S_1,S_2)})$, and let
$E = E' \cup \{(x, N) \mid x \in F'\}$. Now, conditions (1), (2), and (4) of
Lemma 12 hold trivially. Finally, to prove condition (3), we need to show that
$e \in O_{((S_1,S_2))} \cap E$ implies $(\pi_2(e), N) \notin E$. Let
$e \in O_{((S_1,S_2))} \cap E$. We have, by definition of $E$, that
$e \in I_{[S_1,S_2]}$ or $e \in I_{(S_1,S_2)}$. The latter is a contradiction of
$I_{(S_1,S_2)} \cap O_{((S_1,S_2))} = \emptyset$. The former implies, by the second
equation of this lemma, that $\pi_2(e) \notin \pi_2(I_{(S_1,S_2)}) = F'$.
Consequently, $(\pi_2(e), N) \notin E$.

Now, we prove the other implication. If there is such a compatible $E$, then,
by Lemma 12, $(I_{[S_1,S_2]} \cup I_{(S_1,S_2)}) \cap O_{[S_1,S_2]} = \emptyset$.
Assume $I_{(S_1,S_2)} \cap O_{((S_1,S_2))} \neq \emptyset$, and let
$e \in I_{(S_1,S_2)} \cap O_{((S_1,S_2))}$. Since $e \in I_{(S_1,S_2)}$, we have, by
condition (1) in Lemma 12, $e \in E$, and we have by condition (2)
$(\pi_2(e), N) \in E$. Now since $e \in O_{((S_1,S_2))}$ we have a contradiction by
condition (3). Finally, assume
$\pi_2(I_{(S_1,S_2)}) \cap \pi_2(I_{[S_1,S_2]} \cap O_{((S_1,S_2))}) \neq \emptyset$
and let $x \in \pi_2(I_{(S_1,S_2)}) \cap \pi_2(I_{[S_1,S_2]} \cap O_{((S_1,S_2))})$.
Then, by condition (2), $(x, N) \in E$, and by condition (3), $(x, N) \notin E$,
a contradiction.

By Lemma 12, we have in this case
$(I_{[S_1,S_2]} \cup I_{(S_1,S_2)}) \cap O_{(S_1,S_2)} = \emptyset$ since
$I_{[S_1,S_2]} \cup I_{(S_1,S_2)} \subseteq E$ and $E \cap O_{(S_1,S_2)} = \emptyset$. □

Recall that $I_{(S_1,S_2)} = I_{((S_1,S_2))}$, hence the third equation of Lemma 15
may be rephrased more symmetrically as
"$I_{((S_1,S_2))} \cap O_{((S_1,S_2))} = \emptyset$". Notice that the case
$N_G(V(S_1) \cup V(S_2)) = \emptyset$ (roughly) corresponds to the situation where
the original graph $F$ that generates $G$ has a connected component $N - N$.
In this case, by Lemma 15, there is a compatible $E \subseteq L \times L$ for
$(S_1, S_2)$ iff $I_{((S_1,S_2))} \cap O_{((S_1,S_2))} = \emptyset$ (since
$I_{[S_1,S_2]} = O_{[S_1,S_2]} = \emptyset$).

Example 16. We continue Example 14 (our running example). Recall that
$I_{[S_1,S_2]} \cup I_{(S_1,S_2)} = \{(a, b), (b, a), (c, c)\}$ and
$O_{[S_1,S_2]} = \{(b, b), (c, b)\}$; hence they are disjoint. Also,
$\pi_2(I_{(S_1,S_2)}) = \{a, c\}$ and
$\pi_2(I_{[S_1,S_2]} \cap O_{((S_1,S_2))}) = \pi_2(\{(a, b)\}) = \{b\}$, and
therefore they are disjoint. Finally,
$I_{(S_1,S_2)} \cap O_{((S_1,S_2))} = \emptyset$. Consequently, by Lemma 15, there
is a compatible $E$ for $(S_1, S_2)$; such an $E$ is given in Example 14.
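
The existence test of Lemma 15, together with the embedding relation constructed in its proof, can be sketched in Python as follows. The function name `construct_embedding` and the encoding of tuples are assumptions made for this illustration; the input sets are those of the running example.

```python
def pi2(pairs):
    """Project a set of tuples onto the right-hand components."""
    return {y for (_, y) in pairs}


def construct_embedding(I_pair, O_full, I_out, O_out, N="N"):
    """Return the E built in the proof of Lemma 15, or None if none exists.

    I_pair = I_(S1,S2), O_full = O_((S1,S2)),
    I_out = I_[S1,S2], O_out = O_[S1,S2].
    """
    exists = (
        not ((I_out | I_pair) & O_out)                    # first equation
        and not (pi2(I_pair) & pi2(I_out & O_full))       # second equation
        and not (I_pair & O_full)                         # third equation
    )
    if not exists:
        return None
    # E = E' + {(x, N) | x in F'} with E' = I_out + I_pair, F' = pi2(I_pair).
    return I_out | I_pair | {(x, N) for x in pi2(I_pair)}


# Sets of the running example (Examples 9, 11 and 16).
Lp = "abc"
I_pair = {("b", "a"), ("c", "c")}
O_full = {(x, y) for x in Lp for y in Lp} - I_pair
I_out = {("a", "b")}
O_out = {("b", "b"), ("c", "b")}

E = construct_embedding(I_pair, O_full, I_out, O_out)
```

For the running example the constructed relation coincides with the $E$ of Example 14. Replacing $I_{[S_1,S_2]}$ by the hypothetical set $\{(a, c)\}$ would violate the second equation, and the sketch then returns `None`.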

6 Set of touching graphs 

Let $\mathcal{S} = \{S_i \mid i \in \{1, \ldots, n\}\}$ be a set of mutually
isomorphic and disjoint subgraphs of $G$. In this section we turn to the question
of whether or not there is an $E \subseteq L \times L$ and a linear ordering
$C = (S_{i_1}, S_{i_2}, \ldots, S_{i_n})$ of $\mathcal{S}$ such that $E$ is a
compatible embedding relation for $C$.

The following result is easily obtained from Lemma 5. 

Lemma 17. Let $G$ be a graph, $E \subseteq L \times L$, and
$C = (S_1, \ldots, S_n)$ be a sequence of mutually disjoint induced subgraphs of
$G$ isomorphic to $S$. Then $E$ is compatible for $C$ iff
(1) $I_{[S_1,\ldots,S_n]} \subseteq E \subseteq L^2 \setminus O_{[S_1,\ldots,S_n]}$
and (2) for each two touching $S_i$ and $S_j$ with $i < j$, we have that the first
three conditions of Lemma 12 hold w.r.t. $E$ and $(S_i, S_j)$.

Clearly, if $S_i$ and $S_{i+1}$ are non-touching, then $E$ is compatible for
$(S_1, \ldots, S_i, S_{i+1}, \ldots, S_n)$ iff $E$ is compatible for
$(S_1, \ldots, S_{i+1}, S_i, \ldots, S_n)$. Thus, as we have already seen in
Section 4, the case where $S_1, S_2, \ldots, S_n$ are mutually non-touching is much
less involved: $E$ is compatible for each linear ordering of $\mathcal{S}$.
For touching graphs, the situation is different as the conditions in Lemma 12 are
not symmetric: e.g. $I_{(S_i,S_j)}$ and $I_{(S_j,S_i)}$ generally differ. Hence, we
must choose a linear ordering in a "compatible" way. First, we focus on the
question whether or not there exists an $E$ compatible for a given linear ordering
$C$ of $\mathcal{S}$. We characterize the existence by generalizing Lemma 15 for
the case where more than two graphs can touch each other. To this aim consider the
following graph that represents whether or not subgraphs $S_i$ and $S_j$ in
$\mathcal{S}$ touch.

Definition 18. Let $G$ be a graph and
$\mathcal{S} = \{S_i \mid i \in \{1, \ldots, n\}\}$ be a set of induced subgraphs
of $G$. The touching graph of $G$ w.r.t. $\mathcal{S}$ is the (undirected) graph
$(\mathcal{S}, \{\{S_i, S_j\} \mid S_i \text{ and } S_j \text{ touch}\})$.

We now give the edges of a touching graph an orientation such that the 
obtained graph, called directed touching graph, is acyclic. 

Definition 19. Let $T$ be the touching graph of $G$ w.r.t. $\mathcal{S}$. Then the
directed touching graph of $G$ w.r.t. an ordering $(S_1, S_2, \ldots, S_n)$ of
$\mathcal{S}$ is the directed graph $D = (V(D), E(D))$ where $V(D) = V(T)$ and
$(S_i, S_j) \in E(D)$ iff $\{S_i, S_j\} \in E(T)$ and $i < j$.

For $e = (S_i, S_j) \in E(D)$ we write $O_e = O_{(S_i,S_j)}$ and
$O_{(e)} = O_{((S_i,S_j))}$ (and similarly for the insets $I_e$ and $I_{(e)}$).
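
Definitions 18 and 19 can be illustrated by a short Python sketch. The representation (subgraphs given as node sets and referred to by their index, edges as frozensets) and the names `touching_graph` and `directed_touching_graph` are assumptions for this illustration; the path graph is hypothetical.

```python
from itertools import combinations


def closed_neighbourhood(edges, nodes):
    """V(S) together with N_G(V(S))."""
    nodes = set(nodes)
    return nodes | {u for e in edges if e & nodes for u in e}


def touching_graph(edges, subgraphs):
    """Undirected touching graph: pairs {i, j} of indices of touching subgraphs."""
    closed = [closed_neighbourhood(edges, s) for s in subgraphs]
    return {
        frozenset({i, j})
        for i, j in combinations(range(len(subgraphs)), 2)
        if closed[i] & closed[j]
    }


def directed_touching_graph(touch_edges, ordering):
    """Orient every touching edge from the earlier to the later subgraph."""
    position = {s: k for k, s in enumerate(ordering)}
    return {tuple(sorted(e, key=position.__getitem__)) for e in touch_edges}


# Hypothetical path graph 1 - 2 - ... - 8 with three disjoint subgraphs.
path_edges = {frozenset({k, k + 1}) for k in range(1, 8)}
subgraphs = [{1, 2}, {3, 4}, {7, 8}]
T = touching_graph(path_edges, subgraphs)
```

Since every edge is oriented along a linear ordering, the resulting directed graph is acyclic by construction.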

We now obtain the main result - it generalizes Lemma 15. 

Theorem 20. Let $G$ be a graph and $C = (S_1, \ldots, S_n)$ be a sequence of
induced subgraphs of $G$ and let $D$ be the directed touching graph of $G$ w.r.t.
$C$. There is a compatible $E \subseteq L \times L$ for $C$ iff

$(I_{[S_1,\ldots,S_n]} \cup (\bigcup_{e \in E(D)} I_e)) \cap O_{[S_1,\ldots,S_n]} = \emptyset$,    (1)

$\pi_2(\bigcup_{e \in E(D)} I_e) \cap \pi_2(I_{[S_1,\ldots,S_n]} \cap (\bigcup_{e \in E(D)} O_{(e)})) = \emptyset$, and    (2)

$(\bigcup_{e \in E(D)} I_e) \cap (\bigcup_{e \in E(D)} O_{(e)}) = \emptyset$.    (3)

Moreover, if this is the case, then
$(I_{[S_1,\ldots,S_n]} \cup (\bigcup_{e \in E(D)} I_e)) \cap (\bigcup_{e \in E(D)} O_e) = \emptyset$.

Proof. This proof will be in the same spirit as the proof of Lemma 15.

Assume first that the right-hand side holds. Then take
$E' = I_{[S_1,\ldots,S_n]} \cup (\bigcup_{e \in E(D)} I_e)$, and take
$F' = \pi_2(\bigcup_{e \in E(D)} I_e)$. Now, let
$E = E' \cup \{(x, N) \mid x \in F'\}$. By Lemma 17 it suffices to show that for
each two touching $S_i$ and $S_j$ with $i < j$, the first three conditions of
Lemma 12 hold w.r.t. $E$ and $r = (S_i, S_j)$. Now, conditions (1), (2), and (4)
of Lemma 12 hold trivially. Finally, to prove condition (3), we need to show that
$e \in O_{(r)} \cap E$ implies $(\pi_2(e), N) \notin E$. Let
$e \in O_{(r)} \cap E$. We have, by definition of $E$, that
$e \in I_{[S_{k_1},S_{k_2}]}$ or $e \in I_{(S_{k_3},S_{k_4})}$ for some
$k_1, \ldots, k_4$. The latter is a contradiction of
$I_{(S_{k_3},S_{k_4})} \cap O_{(r)} = \emptyset$. The former implies by the second
equation of this theorem that
$\pi_2(e) \notin \pi_2(\bigcup_{e \in E(D)} I_e) = F'$. Consequently,
$(\pi_2(e), N) \notin E$.

Now, we prove the other implication. Assume that there is a compatible
$E \subseteq L \times L$ for $C$. Then by Lemma 17,
(1) $I_{[S_1,\ldots,S_n]} \subseteq E \subseteq L^2 \setminus O_{[S_1,\ldots,S_n]}$
and (2) for each two touching $S_i$ and $S_j$ with $i < j$, we have that the first
three conditions of Lemma 12 hold w.r.t. $E$ and $(S_i, S_j)$. Hence, by Lemma 12,
$(I_{[S_1,\ldots,S_n]} \cup (\bigcup_{e \in E(D)} I_e)) \cap O_{[S_1,\ldots,S_n]} = \emptyset$.
Assume now that $I_{f_1} \cap O_{(f_2)} \neq \emptyset$ for some
$f_1, f_2 \in E(D)$, and let $e \in I_{f_1} \cap O_{(f_2)}$. Since
$e \in I_{f_1}$, we have, by condition (1) in Lemma 12, $e \in E$, and we have by
condition (2) $(\pi_2(e), N) \in E$. Now since $e \in O_{(f_2)}$ we have a
contradiction by condition (3). Finally, assume
$\pi_2(I_{f_1}) \cap \pi_2(I_{[S_1,\ldots,S_n]} \cap O_{(f_2)}) \neq \emptyset$ and
let $x \in \pi_2(I_{f_1}) \cap \pi_2(I_{[S_1,\ldots,S_n]} \cap O_{(f_2)})$. Then, by
condition (2), $(x, N) \in E$, and by condition (3), $(x, N) \notin E$, a
contradiction.

Finally, by Lemma 15, if this is the case, then
$(I_{[S_1,\ldots,S_n]} \cup (\bigcup_{e \in E(D)} I_e)) \cap (\bigcup_{e \in E(D)} O_e) = \emptyset$. □

7 Determining compatible sequences of subgraphs



In this section we turn to the question of efficiently determining, given a set
$\mathcal{S} = \{S_i \mid i \in \{1, \ldots, n\}\}$ of disjoint subgraphs, an
ordering $C$ of $\mathcal{S}$ (if it exists) such that there is a compatible
$E \subseteq L \times L$ for $C$.

We proceed as follows. First, assuming such an ordering $C$ exists, by Theorem 20,
the following equality

$(I_{[S_1,\ldots,S_n]} \cup (\bigcup_{e \in E(D)} I_e)) \cap (O_{[S_1,\ldots,S_n]} \cup (\bigcup_{e \in E(D)} O_e)) = \emptyset$    (4)

holds, where $D$ is the directed touching graph w.r.t. $C$. We consider this
equality instead of
$(I_{[S_1,\ldots,S_n]} \cup (\bigcup_{e \in E(D)} I_e)) \cap O_{[S_1,\ldots,S_n]} = \emptyset$
for computational efficiency reasons, as we will see below.

Note that, because of distributivity
$(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$, the left-hand side of
Equation (4) is equal to

$\left(\bigcup_{e,f \in E(D)} (I_e \cap O_f)\right) \cup \left(\bigcup_{e \in E(D)} ((I_{[S_1,\ldots,S_n]} \cap O_e) \cup (I_e \cap O_{[S_1,\ldots,S_n]}))\right) \cup (I_{[S_1,\ldots,S_n]} \cap O_{[S_1,\ldots,S_n]})$.

Now, for touching graphs $S_1$ and $S_2$, we call $e = (S_1, S_2)$ admissible
(w.r.t. $\mathcal{S}$) if
$(I_{[S_1,\ldots,S_n]} \cap O_e) \cup (I_e \cap O_{[S_1,\ldots,S_n]}) = \emptyset$.
Or equivalently,

$I_{[S_1,\ldots,S_n]} \cap O_e = \emptyset$ and $I_e \cap O_{[S_1,\ldots,S_n]} = \emptyset$.

Now, to determine the existence of an ordering $C$ of $\mathcal{S}$ and an
$E \subseteq L \times L$ such that $E$ is compatible for $C$, we first check
whether or not $I_{[S_1,\ldots,S_n]} \cap O_{[S_1,\ldots,S_n]} = \emptyset$.
If this does not hold, there is no such $C$ (and $E$). Otherwise, we construct the
admissible touching graph.

Definition 21. Let $T$ be the touching graph of $G$ w.r.t. $\mathcal{S}$. Then the
admissible touching graph of $G$ w.r.t. $\mathcal{S}$ is the directed graph
$D = (V(D), E(D))$ where $V(D) = V(T)$ and, for $e = (S, S')$ with
$S, S' \in V(D)$, $e \in E(D)$ iff $e$ is admissible.

Now, since we consider Equation (4), this graph will be considerably smaller
than the corresponding graph for the original equation. This may yield
a substantial speedup as we subsequently check conditions between edges of this
graph.

Recall that $O_f \subseteq O_{(f)}$, and hence $I_e \cap O_{(f)} = \emptyset$
implies $I_e \cap O_f = \emptyset$. Thus we need to check for each topological
ordering $C$ of the admissible touching graph whether or not

$\bigcup_{e,f \in E(D)} (I_e \cap O_{(f)}) = \emptyset$, and

$\bigcup_{e,f \in E(D)} (\pi_2(I_e) \cap \pi_2(I_{[S_1,\ldots,S_n]} \cap O_{(f)})) = \emptyset$

where $D$ is the directed touching graph w.r.t. $C$. If there is such a $C$, then
there is an $E$ compatible for $C$. Otherwise, there is no linear ordering $C$ and
embedding relation $E$ where $E$ is compatible for $C$.
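
The procedure of this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not an optimized implementation: for simplicity it enumerates all permutations rather than only the topological orderings of the admissible touching graph, the function name `find_ordering` is hypothetical, and all insets and outsets are assumed to be precomputed and passed in as dictionaries keyed by ordered pairs of subgraph indices. The sample data are the sets of the running example, where $O_{((S_2,S_1))}$ is taken to be $L'^2 \setminus I_{(S_2,S_1)}$, as follows from the symmetry property noted after Definition 8.

```python
from itertools import permutations


def pi2(pairs):
    """Project a set of tuples onto the right-hand components."""
    return {y for (_, y) in pairs}


def find_ordering(n, touching, I_out, O_out, I, O, O_full):
    """Return an ordering of subgraphs 0..n-1 admitting a compatible E, or None.

    touching: set of frozenset({i, j}); I, O, O_full: dicts keyed by the
    ordered pair (i, j), holding I_e, O_e and O_(e) as sets of label pairs.
    """
    if I_out & O_out:
        return None
    admissible = {e for e in I if not (I_out & O[e]) and not (I[e] & O_out)}
    for order in permutations(range(n)):
        pos = {s: k for k, s in enumerate(order)}
        D = [tuple(sorted(t, key=pos.__getitem__)) for t in touching]
        if any(e not in admissible for e in D):
            continue
        ins = set().union(*(I[e] for e in D))
        outs = set().union(*(O_full[e] for e in D))
        # Conditions (3) and (2) of Theorem 20 for this ordering.
        if not (ins & outs) and not (pi2(ins) & pi2(I_out & outs)):
            return order
    return None


# Running example: index 0 stands for S1, index 1 for S2.
Lp = "abc"
full = {(x, y) for x in Lp for y in Lp}
I = {(0, 1): {("b", "a"), ("c", "c")}, (1, 0): {("a", "b"), ("c", "c")}}
O = {
    (0, 1): {("a", "a"), ("a", "c"), ("b", "c"), ("c", "a")},
    (1, 0): {("a", "c"), ("b", "b"), ("b", "c"), ("c", "b")},
}
O_full = {e: full - I[e] for e in I}
I_out, O_out = {("a", "b")}, {("b", "b"), ("c", "b")}

order = find_ordering(2, {frozenset({0, 1})}, I_out, O_out, I, O, O_full)
```

The number of permutations grows factorially in the number of subgraphs, so restricting the search to topological orderings of the admissible touching graph, as described above, is the natural refinement of this sketch.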

8 Discussion 

In this paper we considered the problem of graph grammar inference for the case
where one is given a set $\mathcal{S}$ of disjoint isomorphic subgraphs to be
generated by a single rule $r = N \to S/E$, where the embedding relation $E$ is
allowed to contain tuples containing $N$. In this way we generalize results in
[1]. This result is to be seen as a further step towards a systematic account of
NLC graph grammar inference.

Formally, we characterized, given a set
$\mathcal{S} = \{S_i \mid i \in \{1, \ldots, n\}\}$, the existence of an ordering
$C$ of $\mathcal{S}$ and an $E \subseteq L \times L$ such that $E$ is compatible
for $C$. Moreover, if such a $C$ exists, then it is shown to be a topological
ordering of a suitable graph that identifies admissible pairs of touching
subgraphs. The efficiency of the proposed algorithm depends significantly on the
cardinality of $\mathcal{S}$: for small $\mathcal{S}$ the algorithm seems
feasible; however, this has yet to be verified in practice.

Finding a graph $S$ such that the set $\mathcal{S}$ of subgraphs of $G$ isomorphic
to $S$ is (1) "compressible", i.e. there is a compatible embedding relation for a
suitable ordering of $\mathcal{S}$, and (2) optimal (either in cardinality, or in
some other measure) remains to be investigated.

Also, it is natural to consider the case where, for rule $r = N \to S/E$, $N$
is allowed to be a label on a node of $S$ instead of $N$ being contained in
(tuples of) $E$. This would have the consequence that an infinite number of graphs
can be generated by $r$, and, moreover, multiple copies of $S$ can overlap, thereby
loosening the restriction of disjointness considered here.